Finding Approximate Tandem Repeats with the Burrows-Wheeler Transform

Authors

  • Agnieszka Danek
  • Rafał Pokrzywa
Abstract

Approximate tandem repeats in a genomic sequence are two or more contiguous, similar copies of a pattern of nucleotides. They are used in DNA mapping, the study of molecular evolution mechanisms, forensic analysis, and research on the diagnosis of inherited diseases. Their functions are still under investigation and not well defined, but growing biological databases, together with tools for identifying these repeats, may lead to the discovery of their specific role or their correlation with particular features. This paper presents a new approach for finding approximate tandem repeats in a given sequence, where the similarity between consecutive repeats is measured using the Hamming distance. It enhances a method for finding exact tandem repeats in DNA sequences based on the Burrows-Wheeler transform.

Keywords: approximate tandem repeats, Burrows-Wheeler transform, Hamming distance, suffix array
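To make the definition concrete, the following is a minimal brute-force sketch of what "approximate tandem repeat under the Hamming distance" means for a fixed period. It is not the Burrows-Wheeler-based method the paper describes; the function names `hamming` and `approximate_tandem_repeats` and the parameters `period`, `max_mismatches`, and `min_copies` are assumptions introduced here for illustration only.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def approximate_tandem_repeats(seq, period, max_mismatches, min_copies=2):
    """Naive scan (illustrative only, not the BWT-based algorithm).

    Returns (start, period, copies) for every start position where at
    least `min_copies` consecutive blocks of length `period` follow each
    other and each block is within `max_mismatches` of the block before it.
    """
    results = []
    n = len(seq)
    for start in range(n - min_copies * period + 1):
        copies = 1
        pos = start
        # Extend the run while the next block stays close to the current one.
        while (pos + 2 * period <= n and
               hamming(seq[pos:pos + period],
                       seq[pos + period:pos + 2 * period]) <= max_mismatches):
            copies += 1
            pos += period
        if copies >= min_copies:
            results.append((start, period, copies))
    return results


if __name__ == "__main__":
    # "ACG ACG ACT": the third copy differs from the second in one position.
    print(approximate_tandem_repeats("ACGACGACTTT", period=3, max_mismatches=1))
```

The scan compares each copy with the one immediately before it, matching the abstract's "similarity between consecutive repeats". It reports every qualifying start position, so overlapping and non-maximal runs appear in the output, and its running time grows with the sequence length times the repeat length for each period tried; an index-based approach such as the one the paper builds on exists to avoid that cost.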

Similar resources

Approximate Pattern Matching Using the Burrows-Wheeler Transform

The compressed pattern matching problem is to locate the occurrence(s) of a pattern P in a text string T, using a compressed representation of T, with minimal (or no) decompression. In this paper, we consider approximate pattern matching on the text transformed by the Burrows-Wheeler Transform (BWT). This is an important first step towards developing compressed pattern matching algorithm for BW...

Output Distribution of the Burrows-Wheeler Transform

The Burrows-Wheeler transform is a block-sorting algorithm which has been shown empirically to be useful in compressing text data. In this paper we study the output distribution of the transform for i.i.d. sources, tree sources and stationary ergodic sources. We can also give analytic bounds on the performance of some universal compression schemes which use the Burrows-Wheeler transform.

Improvements to the Burrows-Wheeler Compression Algorithm: After BWT Stages

The lossless Burrows-Wheeler Compression Algorithm has received considerable attention over recent years for both its simplicity and effectiveness. It is based on a permutation of the input sequence − the Burrows-Wheeler Transform − which groups symbols with a similar context close together. In the original version, this permutation was followed by a Move-To-Front transformation and a final ent...

Functional Pearls: Inverting the Burrows-Wheeler Transform

Our aim in this pearl is to exploit simple equational reasoning to derive the inverse of the Burrows-Wheeler transform from its specification. We also outline how to derive the inverse of two more general versions of the transform, one proposed by Schindler and the other by Chapin and Tate.

Attacking Scrambled Burrows-Wheeler Transform

Scrambled Burrows-Wheeler transform [6] is an attempt to combine privacy (encryption) and data compression. We show that the proposed approach is insecure. We present chosen plaintext and known plaintext attacks and estimate their complexity in various scenarios.


Publication date: 2012